Bose–Einstein statistics

In statistical mechanics, Bose–Einstein statistics (or more colloquially B–E statistics) determines the statistical distribution of identical, indistinguishable bosons over the energy states in thermal equilibrium.

Concept

Fermi–Dirac and Bose–Einstein statistics apply when quantum effects are important and the particles are "indistinguishable". Quantum effects appear if the concentration of particles satisfies N/V ≥ nq. Here nq is the quantum concentration, at which the interparticle distance equals the thermal de Broglie wavelength, so that the wavefunctions of the particles are touching but not overlapping. Fermi–Dirac statistics apply to fermions (particles that obey the Pauli exclusion principle), and Bose–Einstein statistics apply to bosons. Because the quantum concentration depends on temperature, most systems at high temperatures obey the classical (Maxwell–Boltzmann) limit unless they have a very high density, as in a white dwarf. Both Fermi–Dirac and Bose–Einstein statistics reduce to Maxwell–Boltzmann statistics at high temperature or at low concentration.
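
As a rough illustration of the criterion N/V ≥ nq, the following sketch estimates the quantum concentration of a gas. It assumes the standard definitions λ = h / sqrt(2π m k T) for the thermal de Broglie wavelength and nq = 1/λ³, neither of which is spelled out in the text above. The helium-4 example, the ideal-gas estimate of N/V, and all function names are illustrative choices only.

```python
import math

# CODATA values in SI units
H = 6.62607015e-34      # Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K

def thermal_wavelength(mass_kg, temperature_k):
    """Thermal de Broglie wavelength, lambda = h / sqrt(2 pi m k T) (assumed definition)."""
    return H / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature_k)

def quantum_concentration(mass_kg, temperature_k):
    """Quantum concentration n_q = 1 / lambda^3 (assumed definition)."""
    return 1.0 / thermal_wavelength(mass_kg, temperature_k) ** 3

# Illustrative case: helium-4 treated as an ideal gas at 300 K and 1 atm.
M_HE4 = 6.6464731e-27                    # mass of a helium-4 atom, kg
T = 300.0                                # temperature, K
number_density = 101325.0 / (K_B * T)    # N/V = P / (k T)
n_q = quantum_concentration(M_HE4, T)
ratio = number_density / n_q

print(f"N/V = {number_density:.3e} m^-3")
print(f"n_q = {n_q:.3e} m^-3")
print(f"(N/V) / n_q = {ratio:.1e}")      # far below 1: classical regime
```

Under these assumptions the ratio is many orders of magnitude below one, which is why such a gas is well described by Maxwell–Boltzmann statistics at room temperature.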

Bosons, unlike fermions, are not subject to the Pauli exclusion principle: an unlimited number of particles may occupy the same state at the same time. This explains why, at low temperatures, bosons can behave very differently from fermions: all the particles tend to congregate in the same lowest-energy state, forming what is known as a Bose–Einstein condensate.

B–E statistics was introduced for photons in 1924 by Bose and generalized to atoms by Einstein in 1924–25.

The expected number of particles in an energy state i for B–E statistics is:


n_i = \frac{g_i}{e^{(\varepsilon_i-\mu)/kT}-1}

with \varepsilon_i > \mu and where:

n_i is the number of particles in state i
g_i is the degeneracy of state i
ε_i is the energy of the ith state
μ is the chemical potential
k is Boltzmann's constant
T is absolute temperature

This reduces to Maxwell–Boltzmann statistics for energies kT \ll \varepsilon_i-\mu and to the Rayleigh–Jeans distribution for kT \gg \varepsilon_i-\mu, namely
n_i = \frac{g_i kT}{\varepsilon_i-\mu} .
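
The two limits can be checked numerically. The sketch below is only an illustration: it works with the dimensionless variable x = (ε_i − μ)/kT, and the function names are chosen here rather than taken from any library.

```python
import math

def bose_einstein(x, g=1.0):
    """Expected occupation g / (exp(x) - 1), with x = (eps - mu) / kT > 0."""
    if x <= 0:
        raise ValueError("Bose-Einstein statistics require eps > mu")
    return g / math.expm1(x)   # expm1 stays accurate for small x

def maxwell_boltzmann(x, g=1.0):
    """Classical limit g * exp(-x), valid for x >> 1."""
    return g * math.exp(-x)

def rayleigh_jeans(x, g=1.0):
    """Low-energy limit g / x, valid for x << 1."""
    return g / x

for x in (0.01, 0.1, 1.0, 5.0, 10.0):
    print(f"x = {x:5.2f}   BE = {bose_einstein(x):.5g}   "
          f"MB = {maxwell_boltzmann(x):.5g}   RJ = {rayleigh_jeans(x):.5g}")
```

For large x the Bose–Einstein and Maxwell–Boltzmann columns agree, while for small x the Bose–Einstein column approaches the Rayleigh–Jeans value.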

History

In the early 1920s Satyendra Nath Bose, a professor at the University of Dhaka in British India, was intrigued by Einstein's theory of light waves being made of particles called photons. Bose was interested in deriving Planck's radiation formula, which Max Planck had obtained in 1900 largely by guessing, manipulating the mathematics to fit the empirical evidence. Using Einstein's particle picture, Bose was able to derive the radiation formula by systematically developing a statistics of massless particles without the constraint of particle number conservation. He derived Planck's law of radiation by proposing different states for the photon. Instead of statistical independence of particles, Bose put particles into cells and described statistical independence of cells of phase space. Such systems allow two polarization states and exhibit totally symmetric wavefunctions.

He successfully developed a statistical law governing the behaviour of photons. However, he was not able to publish his work: no journal in Europe would accept his paper, as it was not understood. Bose sent his paper to Einstein, who saw its significance and used his influence to get it published.[1][2]

A derivation of the Bose–Einstein distribution

Suppose we have a number of energy levels, labeled by index \displaystyle i, each level having energy \displaystyle \varepsilon_i and containing a total of \displaystyle n_i particles. Suppose each level contains \displaystyle g_i distinct sublevels, all of which have the same energy, and which are distinguishable. For example, two particles may have different momenta, in which case they are distinguishable from each other, yet they can still have the same energy. The value of \displaystyle g_i associated with level \displaystyle i is called the "degeneracy" of that energy level. Any number of bosons can occupy the same sublevel.

Let \displaystyle w(n,g) be the number of ways of distributing \displaystyle n particles among the \displaystyle g sublevels of an energy level. There is only one way of distributing \displaystyle n particles in one sublevel, therefore \displaystyle w(n,1)=1. It is easy to see that there are \displaystyle (n+1) ways of distributing \displaystyle n particles in two sublevels (the first sublevel can hold anywhere from \displaystyle 0 to \displaystyle n particles, with the remainder in the second), which we will write as:


w(n,2)=\frac{(n+1)!}{n!1!}.

With a little thought (see Notes below) it can be seen that the number of ways of distributing \displaystyle n particles in three sublevels is

w(n,3) = w(n,2) + w(n-1,2) + \cdots + w(1,2) + w(0,2)

so that


w(n,3)=\sum_{k=0}^n w(n-k,2) = \sum_{k=0}^n\frac{(n-k+1)!}{(n-k)!1!}=\frac{(n+2)!}{n!2!}

where we have used the following theorem involving binomial coefficients:


\sum_{k=0}^n\frac{(k+a)!}{k!a!}=\frac{(n+a+1)!}{n!(a+1)!}.

Continuing this process, we can see that \displaystyle w(n,g) is just a binomial coefficient (see Notes below)


w(n,g)=\frac{(n+g-1)!}{n!(g-1)!}.
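
The step from the recursion to the closed form can be checked by brute force. The following sketch is an illustration, not part of the derivation: the function names are chosen here, and the recursion places k particles in the last sublevel and the remaining n − k in the other g − 1 sublevels, starting from w(n,1) = 1.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def w_recursive(n, g):
    """Number of ways to place n identical bosons in g sublevels, by recursion."""
    if g == 1:
        return 1   # all n particles sit in the single sublevel
    # k particles in the last sublevel, n - k in the remaining g - 1 sublevels
    return sum(w_recursive(n - k, g - 1) for k in range(n + 1))

def w_closed(n, g):
    """Closed form (n + g - 1)! / (n! (g - 1)!)."""
    return comb(n + g - 1, n)

# Compare the two expressions over a small grid of n and g.
for n in range(8):
    for g in range(1, 6):
        assert w_recursive(n, g) == w_closed(n, g)

print(w_recursive(4, 3))   # 15
```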

The number of ways that a set of occupation numbers \displaystyle n_i can be realized is the product of the ways that each individual energy level can be populated:


W = \prod_i w(n_i,g_i) =  \prod_i \frac{(n_i+g_i-1)!}{n_i!(g_i-1)!}
\approx\prod_i \frac{(n_i+g_i)!}{n_i!(g_i)!}

where the approximation assumes that g_i \gg 1. Following the same procedure used in deriving the Maxwell–Boltzmann statistics, we wish to find the set of \displaystyle n_i for which \displaystyle W is maximised, subject to the constraint that there be a fixed number of particles and a fixed energy. The maxima of \displaystyle W and \displaystyle \ln(W) occur at the same values of \displaystyle n_i and, since it is easier to accomplish mathematically, we will maximise the latter function instead. We constrain our solution using Lagrange multipliers, forming the function:


f(n_i)=\ln(W)+\alpha(N-\sum n_i)+\beta(E-\sum n_i \varepsilon_i)

Using the g_i \gg 1 approximation and using Stirling's approximation for the factorials \left(\ln(x!)\approx x\ln(x)-x\right) gives

f(n_i)=\sum_i \left[ (n_i + g_i) \ln(n_i + g_i) - n_i \ln(n_i) - g_i \ln (g_i) \right] +\alpha\left(N-\sum_i n_i\right)+\beta\left(E-\sum_i n_i \varepsilon_i\right).

Taking the derivative with respect to \displaystyle n_i, and setting the result to zero and solving for \displaystyle n_i, yields the Bose–Einstein population numbers:


n_i = \frac{g_i}{e^{\alpha+\beta \varepsilon_i}-1}.
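
For reference, the differentiation step can be written out explicitly. Starting from the Stirling form of \displaystyle f given above, only the terms containing a given \displaystyle n_i contribute, and the constant terms produced by differentiating \displaystyle x\ln(x) cancel:

\frac{\partial f}{\partial n_i} = \ln(n_i + g_i) - \ln(n_i) - \alpha - \beta\varepsilon_i = 0

so that

\frac{n_i + g_i}{n_i} = e^{\alpha+\beta\varepsilon_i}, \qquad \frac{g_i}{n_i} = e^{\alpha+\beta\varepsilon_i} - 1,

which rearranges to the population numbers stated above.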

It can be shown thermodynamically that \displaystyle \beta = \frac{1}{kT}, where \displaystyle k is Boltzmann's constant and \displaystyle T is the temperature.

It can also be shown that \displaystyle \alpha = - \frac{\mu}{kT}, where \displaystyle \mu is the chemical potential, so that finally:


n_i = \frac{g_i}{e^{(\varepsilon_i-\mu)/kT}-1}.

Note that the above formula is sometimes written:


n_i = \frac{g_i}{e^{\varepsilon_i/kT}/z-1},

where \displaystyle z=\exp(\mu/kT) is the absolute activity.
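
The claim that the Bose–Einstein populations maximise \displaystyle \ln(W) under the two constraints can be checked numerically on a toy system. The sketch below is an illustration under stated assumptions: the three energy levels, their degeneracies and the values of α and β are hypothetical, and \displaystyle \ln(W) is evaluated in the same Stirling, g_i \gg 1 form used in the derivation.

```python
import math

def ln_w_stirling(n, g):
    """ln W in the Stirling, g >> 1 form: sum of (n+g)ln(n+g) - n ln n - g ln g."""
    return sum((ni + gi) * math.log(ni + gi) - ni * math.log(ni) - gi * math.log(gi)
               for ni, gi in zip(n, g))

# Hypothetical toy system: three equally spaced levels, arbitrary alpha and beta.
energies = [0.0, 1.0, 2.0]
degeneracies = [1000.0, 2000.0, 3000.0]
alpha, beta = 0.5, 1.0

# Bose-Einstein populations n_i = g_i / (exp(alpha + beta * eps_i) - 1)
n_star = [g / math.expm1(alpha + beta * e) for g, e in zip(degeneracies, energies)]

# For equally spaced energies the shift (1, -2, 1) changes neither sum(n_i)
# nor sum(n_i * eps_i), so it respects both constraints.
direction = [1.0, -2.0, 1.0]

best = ln_w_stirling(n_star, degeneracies)
for step in (-5.0, -1.0, 1.0, 5.0):
    shifted = [ni + step * d for ni, d in zip(n_star, direction)]
    change = ln_w_stirling(shifted, degeneracies) - best
    print(f"step = {step:+.0f}   change in ln W = {change:.3e}")   # negative in each case
```

Every constraint-preserving shift lowers \displaystyle \ln(W), as expected for a constrained maximum of a concave function.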

Notes

A much simpler way to obtain the counting factor w(n,g) is to represent the n particles by identical balls and to separate the g shells (sublevels) by g-1 line partitions. The distinct permutations of these n balls and g-1 partitions then give the different ways of arranging the bosons among the shells.

Say, for n=3 particles and g=3 shells, and therefore g-1=2 partitions, the arrangement may be like

|..|. or ||... or |.|.. etc.

Hence the number of distinct permutations of n + (g-1) objects which have n identical items and (g-1) identical items will be:

(n+g-1)!/(n!(g-1)!)
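
The balls-and-partitions picture can be enumerated directly. The sketch below is illustrative only: the function name is chosen here, a dot stands for a particle and a bar for a partition, and each arrangement is fixed by choosing which of the n + g − 1 positions hold the g − 1 bars.

```python
from itertools import combinations
from math import comb

def arrangements(n, g):
    """All strings made of n dots (particles) and g - 1 bars (partitions)."""
    length = n + g - 1
    result = []
    for bar_positions in combinations(range(length), g - 1):
        bars = set(bar_positions)
        result.append("".join("|" if i in bars else "." for i in range(length)))
    return result

strings = arrangements(3, 3)
print(len(strings), comb(3 + 3 - 1, 3))   # both counts are 10
print(strings)
```

The list includes the arrangements |..|. and ||... and |.|.. quoted above.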

Alternatively, the counting can be set out in more detail as follows.

The purpose of these notes is to clarify some aspects of the derivation of the Bose–Einstein (B–E) distribution for beginners. The enumeration of cases (or ways) in the B–E distribution can be recast as follows. Consider a game of dice throwing in which there are \displaystyle n dice, with each die taking values in the set \displaystyle \left\{ 1, \dots, g \right\}, for g \ge 1. The constraints of the game are that the value of a die \displaystyle i, denoted by \displaystyle m_i, has to be greater than or equal to the value of die \displaystyle (i-1), denoted by \displaystyle m_{i-1}, in the previous throw, i.e., m_i \ge m_{i-1}. Thus a valid sequence of die throws can be described by an n-tuple \displaystyle \left( m_1 , m_2 , \dots , m_n \right), such that m_i \ge m_{i-1}. Let \displaystyle S(n,g) denote the set of these valid n-tuples:


   S(n,g) = 
   \Big\{ 
      \left( m_1 , m_2 , \dots , m_n \right) 
      \Big| \Big.
      m_i \ge m_{i-1} ,
      m_i \in \left\{ 1,  \dots, g \right\} ,
      \forall i = 1, \dots , n 
   \Big\}

(1)

Then the quantity \displaystyle w(n,g) (defined above as the number of ways to distribute \displaystyle n particles among the \displaystyle g sublevels of an energy level) is the cardinality of \displaystyle S(n,g), i.e., the number of elements (or valid n-tuples) in \displaystyle S(n,g). Thus the problem of finding an expression for \displaystyle w(n,g) becomes the problem of counting the elements in \displaystyle S(n,g).

Example n = 4, g = 3:


   S(4,3) =
   \left\{ 
      \underbrace{(1111), (1112), (1113)}_{(a)},
      \underbrace{(1122), (1123), (1133)}_{(b)},
      \underbrace{(1222), (1223), (1233), (1333)}_{(c)},
   \right.

   \left.
      \underbrace{(2222), (2223), (2233), (2333), (3333)}_{(d)}
   \right\}

\displaystyle w(4,3) = 15 (there are \displaystyle 15 elements in \displaystyle S(4,3)).
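
The set \displaystyle S(4,3) can be generated mechanically. The sketch below is one possible way to do so, using the fact that Python's itertools.combinations_with_replacement yields exactly the non-decreasing tuples when its input is sorted. The function name is chosen here for illustration.

```python
from itertools import combinations_with_replacement
from math import comb

def valid_tuples(n, g):
    """Non-decreasing n-tuples with entries in {1, ..., g}, in lexicographic order."""
    return list(combinations_with_replacement(range(1, g + 1), n))

s43 = valid_tuples(4, 3)
print(len(s43))                                  # 15
print(["".join(map(str, t)) for t in s43])       # 1111, 1112, 1113, 1122, ...

# The count agrees with (n + g - 1)! / (n! (g - 1)!) for other small cases too.
for n in range(6):
    for g in range(1, 5):
        assert len(valid_tuples(n, g)) == comb(n + g - 1, n)
```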

Subset \displaystyle (a) is obtained by fixing all indices \displaystyle m_i to \displaystyle 1, except for the last index, \displaystyle m_n, which is incremented from \displaystyle 1 to \displaystyle g=3. Subset \displaystyle (b) is obtained by fixing \displaystyle m_1 = m_2 = 1, and incrementing \displaystyle m_3 from \displaystyle 2 to \displaystyle g=3. Due to the constraint \displaystyle m_i \ge m_{i-1} on the indices in \displaystyle S(n,g), the index \displaystyle m_4 must automatically take values in \displaystyle \left\{ 2, 3 \right\}. The construction of subsets \displaystyle (c) and \displaystyle (d) follows in the same manner.

Each element of \displaystyle S(4,3) can be thought of as a multiset of cardinality \displaystyle n=4; the elements of such multiset are taken from the set \displaystyle \left\{ 1, 2, 3 \right\} of cardinality \displaystyle g=3, and the number of such multisets is the multiset coefficient


   \displaystyle 
   \left\langle 
      \begin{matrix} 
	 3 
	 \\ 
	 4 
      \end{matrix}
   \right\rangle 
   = {3 + 4 - 1 \choose 3-1}
   = {3 + 4 - 1 \choose 4}
   =
   \frac
   {6!}
   {4! 2!}
   = 15

More generally, each element of \displaystyle S(n,g) is a multiset of cardinality \displaystyle n (number of dice) with elements taken from the set \displaystyle \left\{ 1, \dots, g \right\} of cardinality \displaystyle g (number of possible values of each die), and the number of such multisets, i.e., \displaystyle w(n,g) is the multiset coefficient


   \displaystyle 
   w(n,g) 
   =
   \left\langle 
      \begin{matrix} 
	 g 
	 \\ 
	 n 
      \end{matrix}
   \right\rangle 
   = {g + n - 1 \choose g-1}
   = {g + n - 1 \choose n}
   = 
   \frac{(g + n - 1)!}
   {n! (g-1)!}

(2)

which is exactly the same as the formula for \displaystyle w(n,g), as derived above with the aid of a theorem involving binomial coefficients, namely


\sum_{k=0}^n\frac{(k+a)!}{k!a!}=\frac{(n+a+1)!}{n!(a+1)!}.

(3)

To understand the decomposition


   \displaystyle 
   w(n,g) 
   =
   \sum_{k=0}^{n}
   w(n-k, g-1)
   =
   w(n, g-1)
   +
   w(n-1, g-1)
   +
   \cdots
   +
   w(1, g-1)
   +
   w(0, g-1)

(4)

or for example, \displaystyle n=4 and \displaystyle g=3


   \displaystyle 
   w(4,3)
   =
   w(4,2)
   +
   w(3,2)
   +
   w(2,2)
   +
   w(1,2)
   +
   w(0,2),

let us rearrange the elements of \displaystyle S(4,3) as follows


   S(4,3) =
   \left\{ 
      \underbrace{
	 (1111), 
	 (1112), 
	 (1122), 
	 (1222), 
	 (2222)
      }_{(\alpha)},
      \underbrace{
	 (111{\color{Red}\underset{=}{3}}),
	 (112{\color{Red}\underset{=}{3}}), 
	 (122{\color{Red}\underset{=}{3}}), 
	 (222{\color{Red}\underset{=}{3}}) 
      }_{(\beta)},
   \right.

   \left.
      \underbrace{
	 (11{\color{Red}\underset{==}{33}}),
	 (12{\color{Red}\underset{==}{33}}), 
	 (22{\color{Red}\underset{==}{33}}) 
      }_{(\gamma)},
      \underbrace{
	 (1{\color{Red}\underset{===}{333}}),
	 (2{\color{Red}\underset{===}{333}}) 
      }_{(\delta)},
      \underbrace{
	 ({\color{Red}\underset{====}{3333}})
      }_{(\omega)}
   \right\}
.

Clearly, the subset \displaystyle (\alpha) of \displaystyle S(4,3) is the same as the set


   \displaystyle 
   S(4,2)
   =
   \left\{ 
	 (1111), 
	 (1112), 
	 (1122), 
	 (1222), 
	 (2222)
   \right\}
.

By deleting the index \displaystyle m_4=3 (shown in red with double underline) in the subset \displaystyle (\beta) of \displaystyle S(4,3), one obtains the set


   \displaystyle 
   S(3,2)
   =
   \left\{ 
	 (111),
	 (112), 
	 (122), 
	 (222) 
   \right\}
.

In other words, there is a one-to-one correspondence between the subset \displaystyle (\beta) of \displaystyle S(4,3) and the set \displaystyle S(3,2). We write


   \displaystyle 
   (\beta)
   \longleftrightarrow
   S(3,2)
.

Similarly, it is easy to see that


   \displaystyle 
   (\gamma)
   \longleftrightarrow
   S(2,2)
   =
   \left\{ 
	 (11),
	 (12), 
	 (22) 
   \right\}

   \displaystyle 
   (\delta)
   \longleftrightarrow
   S(1,2)
   =
   \left\{ 
	 (1),
	 (2) 
   \right\}

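   \displaystyle 
   (\omega)
   \longleftrightarrow
   S(0,2)
   =
   \left\{ 
	 ()
   \right\}
(the set whose only element is the empty tuple, so that its cardinality is \displaystyle w(0,2) = 1).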
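
The correspondences above can be verified by grouping the elements of \displaystyle S(4,3) according to how many trailing entries equal \displaystyle g=3. The sketch below is illustrative, with names chosen here. Note that itertools yields exactly one (empty) tuple for length zero, which matches the convention \displaystyle w(0,g)=1 used in these notes.

```python
from collections import Counter
from itertools import combinations_with_replacement

def trailing_count(t, value):
    """Number of consecutive entries equal to `value` at the end of tuple t."""
    count = 0
    for entry in reversed(t):
        if entry != value:
            break
        count += 1
    return count

n, g = 4, 3
s_ng = list(combinations_with_replacement(range(1, g + 1), n))

# Tuples are non-decreasing, so every entry equal to g sits at the end.
sizes = Counter(trailing_count(t, g) for t in s_ng)

for k in range(n + 1):
    # S(n - k, g - 1): non-decreasing (n - k)-tuples with entries in {1, ..., g - 1}
    smaller = list(combinations_with_replacement(range(1, g), n - k))
    print(f"k = {k}: subset size {sizes[k]}, |S({n - k},{g - 1})| = {len(smaller)}")
```

The subset sizes come out as 5, 4, 3, 2 and 1, matching the sizes of S(4,2), S(3,2), S(2,2), S(1,2) and S(0,2) respectively, and they add up to 15.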

Thus we can write


   \displaystyle 
   S(4,3) 
   =
   \bigcup_{k=0}^{4}
   S(4-k,2)

or more generally,


   \displaystyle 
   S(n,g) 
   =
   \bigcup_{k=0}^{n}
   S(n-k,g-1)
;

(5)

and since the sets


   \displaystyle 
   S(i,g-1) \ , \ {\rm for} \ i = 0, \dots , n

are non-intersecting, we thus have


   \displaystyle 
   w(n,g) 
   =
   \sum_{k=0}^{n}
   w(n-k,g-1)
,

(6)

with the convention that


   \displaystyle 
   w(0,g)
   =
   1 \ , \forall g \ge 1
   \ ,
   {\rm and}
   \ 
   w(n,1)
   =
   1 \ , \forall n
.
(7)

Continuing the process until only a single sublevel remains, we arrive at the following formula


   \displaystyle 
   w(n,g) 
   =
   \sum_{k_1=0}^{n}
   \sum_{k_2=0}^{n-k_1}
   w(n - k_1 - k_2, g-2)
   =
   \sum_{k_1=0}^{n}
   \sum_{k_2=0}^{n-k_1}
   \cdots
   \sum_{k_{g-1}=0}^{n-\sum_{j=1}^{g-2} k_j}
   w(n - \sum_{i=1}^{g-1} k_i, 1).

Using the second convention in (7) above, we obtain the formula


   \displaystyle 
   w(n,g) 
   =
   \sum_{k_1=0}^{n}
   \sum_{k_2=0}^{n-k_1}
   \cdots
   \sum_{k_{g-1}=0}^{n-\sum_{j=1}^{g-2} k_j}
   1,

(8)

keeping in mind that for \displaystyle q and \displaystyle p being constants, we have


   \displaystyle 
   \sum_{k=0}^{q}
   p
   =
   (q+1) p
.

(9)

It can then be verified that (8) and (2) give the same result for \displaystyle w(4,3), \displaystyle w(3,3), \displaystyle w(3,2), etc.

Information retrieval

In recent years, Bose–Einstein statistics have also been used as a method for term weighting in information retrieval. The method is one of a collection of DFR ("Divergence From Randomness") models, the basic notion being that Bose–Einstein statistics may be a useful indicator in cases where a particular term and a particular document have a significant relationship that would not have occurred purely by chance. Source code for implementing this model is available from the Terrier project at the University of Glasgow.

Notes

  1. Hey, Anthony J. G.; Walters, Patrick (2003). The New Quantum Universe. London: Cambridge University Press. pp. 139–141. ISBN 0521564573. 
  2. Rigden, John S. (2005). Einstein 1905: The Standard of Greatness. Massachusetts: Harvard University Press. pp. 143–144. ISBN 0674015444. 

References